SUMMARY

It is suggested that problems in a reliability context may be
handled by a Bayesian non-parametric approach. A stochastic process
is defined whose sample paths may be assumed to be either increasing
hazard rates or decreasing hazard rates by properly choosing the
parameter functions of the process. The posterior distributions of
the hazard rates are derived for both exact and censored data. Bayes
estimates of hazard rates, c.d.f.'s, densities, and means are found
under squared error type loss functions. Some simulation is done and
the estimates are graphed to better understand the estimators.
Finally, estimates of the c.d.f. from some data in a paper by Kaplan
and Meier are constructed.
1.  INTRODUCTION

Recently, there has been a good deal of interest in nonparametric
Bayesian approaches to statistical inference. In this approach a
stochastic process is defined whose sample paths index a large family
of distributions. Then, conditional on a realization of the process,
i.i.d. observations are taken from the indexed distribution, and
inferences are made from the posterior distribution of the process.
In this manner the prior probability can be spread over a very large
number of distributions. It is also possible to avoid explicitly
specifying the functional form of the likelihood.

The most common approach has been extensively discussed by Ferguson
(1973), and consists of using a 'Dirichlet Process' prior. That is, a
continuous time parameter stochastic process whose finite dimensional
increments have a Dirichlet distribution is defined. One can then
assume that the sample paths of this process are cumulative
distribution functions. Ferguson shows that the posterior distribution
of the process, given the complete observations, is also distributed
as a Dirichlet stochastic process, and uses this posterior
distribution for making his statistical inferences.

Antoniak (1974) considers mixtures of Dirichlet distributions. Doksum
(1974) directs his attention to prior stochastic processes that are
'tailfree' and/or 'neutral'. His posterior distributions, however,
are obtained in terms of expectations over the entire probability
space. Susarla and Van Ryzin (1976) were able to obtain the posterior
mean for censored data using a Dirichlet prior. Recently, Ferguson and
Phadia (1976) were able to generalize these censored data results to
more general "neutral to the right" processes.

This type of approach seems to have merit for statistical inference
in a reliability context. What is proposed, since the concept of
hazard rate plays such a key role in statistical reliability, is to
place the prior probability over the collection of hazard rates. This
is done by defining an appropriate stochastic process whose sample
paths are hazard rates. With this prior we derive the posterior
distribution of the hazard rates for both right censored and exact
observations. This approach has the advantage of placing the prior
probability strictly on absolutely continuous distributions rather
than on discrete distributions as is the case with the Dirichlet
process prior. Moreover, Bayes estimators of the entire distribution
under natural loss functions are absolutely continuous. Finally,
since our prior random c.d.f.'s are not neutral to the right, the
work of Doksum (1974) and Ferguson and Phadia (1976) does not apply.

2.  THE  EXTENDED  GAMMA  PROCESS

We shall assume throughout that our distributions have positive
probability only on the nonnegative half of the real line, although
one could adapt to distributions over the whole real line.

The hazard function $H(x)$ of a distribution is defined to be

    $H(x) = -\ln(1 - F(x))$

where $F(x)$ is the left continuous c.d.f. of the distribution as in
Loève (1963). (It is also possible to work with right continuous
c.d.f.'s, but left continuous c.d.f.'s are computationally more
convenient for us.) We shall refer to $\bar F(x) = 1 - F(x)$ as the
survival function of the distribution. Note that from some point on,
$H(x)$ may equal plus infinity. If, for all $x$, one can express

    $H(x) = \int_{[0,x)} r(t)\,dt$,

then $r(x)$ is called the hazard rate of the distribution. Thus,
$r(x)$ is related to the density $f(x)$ by the relationship

    $r(x) = \dfrac{f(x)}{\bar F(x)}$,

and has the interpretation that $r(x)\Delta$ is approximately equal to
the probability of failure in the next $\Delta$ increment of time
given that a lifetime has survived until time $x$.

We denote by $G(\alpha, \beta)$ the gamma distribution with shape
parameter $\alpha \ge 0$ and scale parameter $\beta > 0$. For
$\alpha > 0$, this distribution has for its density with respect to
Lebesgue measure,

    $g(x \mid \alpha, \beta) = \dfrac{x^{\alpha-1} \exp(-x/\beta)\, I_{(0,\infty)}(x)}{\Gamma(\alpha)\,\beta^{\alpha}}$,

with the distribution assumed to be degenerate at 0 if $\alpha = 0$.

Let $\alpha(t)$, $t \ge 0$, be a nondecreasing left-continuous
real-valued function such that $\alpha(0) = 0$, and let $\beta(t)$,
$t \ge 0$, be a positive right-continuous real-valued function,
bounded away from 0 and $\infty$, with left-hand limits existing.

Let $Z(t)$, $t \ge 0$, defined on an appropriate probability space
$(\Omega, P)$, denote a gamma process with independent increments
corresponding to $\alpha(t)$. That is, $Z(0) = 0$, $Z(t)$ has
independent increments, and for $t > s$, $Z(t) - Z(s)$ is
$G(\alpha(t) - \alpha(s), 1)$.

It has been shown (see Ferguson (1973)) that such a process exists
and that its distribution is uniquely determined. We assume WLOG that
this process has nondecreasing left continuous sample paths.

We now define a new stochastic process by

    (2.1)  $r(t) = \int_{[0,t)} \beta(s)\,dZ(s)$,

with the interpretation that for almost every $\omega$,
$Z(t, \omega)$ is a nondecreasing left continuous function in $t$
bounded on every finite interval, and $r(t)$ is the Lebesgue-Stieltjes
integral, with respect to that function, of $\beta(s)$ over the
interval $[0,t)$.

We say a process defined in this manner has an extended gamma
distribution, and we denote this by writing

    $r(t)$ is $\Gamma(\alpha(\cdot), \beta(\cdot))$.

The finite dimensional c.d.f.'s (or densities) of $r(t)$ appear to be
rather intractable, although the distribution of the extended gamma
process is "nice" in many ways.

THEOREM 2.1 If $r(t)$ is distributed as
$\Gamma(\alpha(\cdot), \beta(\cdot))$, then $r(t)$ has independent
increments and for fixed $t$

(2.2)  the characteristic function of $r(t)$ in some neighborhood of
       0 is given by

    $\psi_{r(t)}(\theta) = \exp\left[-\int_{[0,t)} \ln(1 - i\beta(s)\theta)\,d\alpha(s)\right]$,

(2.3)  $E\,r(t) = \int_{[0,t)} \beta(s)\,d\alpha(s)$, and

(2.4)  $\mathrm{Var}\,r(t) = \int_{[0,t)} \beta^2(s)\,d\alpha(s)$.


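PROOF:

Let $0 = t_0^{(n)} < t_1^{(n)} < \cdots < t_{m(n)}^{(n)}$ be a
sequence of partitions whose norm goes to 0 and
$t_{m(n)}^{(n)} \to \infty$ as $n \to \infty$. Define

    (2.5)  $r_n(t) = \sum_{\{j > 0;\ t_j^{(n)} < t\}} \beta(t_{j-1}^{(n)}) \left[Z(t_j^{(n)}) - Z(t_{j-1}^{(n)})\right]$

where if the index set is empty, we assume $r_n(t) = 0$. Then
$r_n(t) \to r(t)$ a.s., so that $r_n(t) \to r(t)$ in law and
$\psi_{r_n(t)}(\theta) \to \psi_{r(t)}(\theta)$. Also,

    $\psi_{r_n(t)}(\theta) = \prod_{\{j;\ t_j^{(n)} < t\}} \psi_{\beta(t_{j-1}^{(n)})[Z(t_j^{(n)}) - Z(t_{j-1}^{(n)})]}(\theta)$

    $= \prod_{\{j;\ t_j^{(n)} < t\}} \left(1 - i\beta(t_{j-1}^{(n)})\theta\right)^{-(\alpha(t_j^{(n)}) - \alpha(t_{j-1}^{(n)}))}$

    $= \exp\left[-\sum_{\{j;\ t_j^{(n)} < t\}} \left(\alpha(t_j^{(n)}) - \alpha(t_{j-1}^{(n)})\right) \ln\left(1 - i\beta(t_{j-1}^{(n)})\theta\right)\right]$,

and letting $n \to \infty$ gives (2.2).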
The independent increments follow easily by letting the increment
endpoints be contained in the partition points.

Since the original gamma process $Z(t)$ is a pure jump process,
the extended gamma process will also be a pure jump process.
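
The partition sums in (2.5) also suggest a simple way to simulate
approximate sample paths. The following is a minimal sketch, not part
of the original development: it assumes the illustrative choices
$\alpha(t) = t$ and $\beta(t) \equiv 2$, a fixed grid on $[0,1]$, and a
hypothetical helper name `simulate_extended_gamma`. It draws
independent gamma increments and compares the simulated mean and
variance of $r(1)$ with (2.3) and (2.4).

```python
import random

def simulate_extended_gamma(alpha, beta, grid, rng):
    """Return one approximate sample path of r(t), evaluated at grid[1:].

    Each increment Z(right) - Z(left) is drawn as Gamma(alpha(right) - alpha(left), 1)
    and weighted by beta at the left endpoint, mirroring the sums r_n(t) in (2.5).
    """
    path, r = [], 0.0
    for left, right in zip(grid[:-1], grid[1:]):
        shape = alpha(right) - alpha(left)        # mass of alpha on [left, right)
        dz = rng.gammavariate(shape, 1.0) if shape > 0 else 0.0
        r += beta(left) * dz                      # beta evaluated at the left endpoint
        path.append(r)
    return path

rng = random.Random(0)                            # fixed seed, for reproducibility only
grid = [i / 50 for i in range(51)]                # assumed partition of [0, 1]
finals = [simulate_extended_gamma(lambda t: t, lambda t: 2.0, grid, rng)[-1]
          for _ in range(4000)]
mean = sum(finals) / len(finals)
var = sum((x - mean) ** 2 for x in finals) / (len(finals) - 1)
print(mean, var)   # to be compared with E r(1) = 2 and Var r(1) = 4
```

Every simulated path is nondecreasing by construction, since each
gamma increment is nonnegative and $\beta$ is positive.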

3.  RANDOM  HAZARD  RATES 

Provided $\alpha(t)$ is not identically zero, we may assume that the
sample paths of an extended gamma process $r(t)$ are well defined
nondecreasing hazard rates corresponding to absolutely continuous
distributions. Thus the conditional distribution of the observations
$X_1, \ldots, X_n$ given $r(t)$ will be defined by

    (3.1)  $P(X_1 \ge x_1, \ldots, X_n \ge x_n \mid r(t)) = \prod_{i=1}^{n} \exp\left[-\int_{[0,x_i)} r(t)\,dt\right]$  a.s.

Of course (3.1) and the distribution of $r(t)$ will determine the
joint distribution of $X_1, \ldots, X_n, r(t)$ and will be used to
derive the marginal distribution of $X_1, \ldots, X_n$ and the
posterior distribution of $r(t)$ given the observed values of
$X_1, \ldots, X_n$.

Since the sample paths of the $r(t)$ process are nondecreasing
functions a.s., we are placing our prior probability entirely within
the class of distributions with nondecreasing hazard rates. Later, we
will show how the prior can be placed over distributions with
nonincreasing hazard rates.

In assigning a prior probability measure by this method, one needs to
input the functions $\alpha(t)$ and $\beta(t)$. One approach consists
of defining nondecreasing mean and variance functions $\mu(t)$ and
$\sigma^2(t)$. It would seem reasonable to assign as $\mu(t)$ the best
"guess" of the hazard rate and use $\sigma^2(t)$ to measure the amount
of uncertainty or variation in the hazard rate at the point $t$. Thus
a band $\mu(t) \pm 2\sigma(t)$ should cover most of the "feeling" for
the location of the hazard rate. Assuming $\mu(t)$, $\sigma^2(t)$ and
$\alpha(t)$ are all differentiable, one can use (2.3) and (2.4) to set

    $\mu(t) = \int_{[0,t)} \beta(s)\,\alpha'(s)\,ds$, and

    $\sigma^2(t) = \int_{[0,t)} \beta^2(s)\,\alpha'(s)\,ds$.

Solving for $\alpha(t)$ and $\beta(t)$ yields

    (3.2)  $\beta(t) = \dfrac{d\sigma^2(t)/dt}{d\mu(t)/dt}$, and

    (3.3)  $\dfrac{d\alpha(t)}{dt} = \dfrac{\left(d\mu(t)/dt\right)^2}{d\sigma^2(t)/dt}$,

which then determines the prior distribution. The form of the
posterior distribution gives information on the effect of the prior
and may help in choosing $\alpha(\cdot)$ and $\beta(\cdot)$.
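
As an illustration of (3.2) and (3.3), the sketch below converts
derivatives of an elicited mean and variance function into the prior
parameter functions. The function name `prior_from_moments` and the
example choice $\mu(t) = 2t$, $\sigma^2(t) = 4t$ are assumptions made
only for this example; it presumes both derivatives are strictly
positive.

```python
def prior_from_moments(d_mu, d_sigma2):
    """Return (beta, alpha_prime) from the derivatives of mu(t) and sigma^2(t).

    Assumes d_mu(t) > 0 and d_sigma2(t) > 0 wherever the results are evaluated.
    """
    def beta(t):
        return d_sigma2(t) / d_mu(t)           # equation (3.2)

    def alpha_prime(t):
        return d_mu(t) ** 2 / d_sigma2(t)      # equation (3.3)

    return beta, alpha_prime

# Hypothetical elicitation: mu(t) = 2t and sigma^2(t) = 4t.
beta, alpha_prime = prior_from_moments(lambda t: 2.0, lambda t: 4.0)
print(beta(0.5), alpha_prime(0.5))
```

Under these assumed inputs the recovered functions are
$\beta(t) \equiv 2$ and $\alpha'(t) \equiv 1$, that is
$\alpha(t) = t$.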

The  marginal  distribution  of  an  observation  X can  be  found 
from  (3.1)  with  the  use  of  a limiting  argument.  The  proof  of 
Theorem  3.1  is  given  in  section  7. 

THEOREM 3.1 If the prior over hazard rates is
$\Gamma(\alpha(\cdot), \beta(\cdot))$, then the marginal survival
function of an observation $X$ is given by

    (3.4)  $\bar F(t) = P(X \ge t) = \exp\left[-\int_{[0,t)} \ln(1 + \beta(s)(t-s))\,d\alpha(s)\right]$.
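
To make (3.4) concrete, the following sketch evaluates the marginal
survival function numerically. It assumes that $\alpha$ has a
derivative `alpha_prime` (so that $d\alpha(s) = \alpha'(s)\,ds$) and
uses a plain midpoint rule; the function name and step count are
illustrative choices, not part of the original text. For
$\alpha(t) = t$ and $\beta \equiv 2$ the integral has the closed form
$\tfrac{1}{2}(1+2t)\ln(1+2t) - t$, which gives a check on the
quadrature.

```python
import math

def marginal_survival(t, beta, alpha_prime, steps=2000):
    """Evaluate (3.4) by the midpoint rule, assuming d alpha(s) = alpha_prime(s) ds."""
    if t <= 0:
        return 1.0
    h = t / steps
    total = 0.0
    for k in range(steps):
        s = (k + 0.5) * h                                   # midpoint of the k-th cell
        total += math.log1p(beta(s) * (t - s)) * alpha_prime(s) * h
    return math.exp(-total)

# Assumed prior for illustration: alpha(t) = t, beta(t) = 2.
numeric = marginal_survival(1.0, lambda s: 2.0, lambda s: 1.0)
closed_form = math.exp(-(0.5 * 3.0 * math.log(3.0) - 1.0))
print(numeric, closed_form)
```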


The joint marginal survival function of the observations
$X_1, \ldots, X_n$ can be found by methods similar to Theorem 3.1 and
is given in the following corollary.

COROLLARY 3.1 If the prior over the hazard rates is
$\Gamma(\alpha(\cdot), \beta(\cdot))$, then the joint marginal
survival function of $n$ observations $X_1, \ldots, X_n$ is

    (3.5)  $\bar F(t_1, \ldots, t_n) = P(X_1 \ge t_1, \ldots, X_n \ge t_n) = \exp\left[-\int_{[0,\infty)} \ln\left(1 + \beta(s) \sum_{i=1}^{n} (t_i - s)^+\right) d\alpha(s)\right]$

where $a^+ = \sup\{a, 0\}$.

Thus the marginal survival function of $Y = \min(X_1, \ldots, X_n)$
is of the same form as the survival function of just $X_1$ providing
$\beta(s)$ is replaced by $n\beta(s)$.

The key problem in any Bayesian setting is to derive the posterior
distribution. Moreover, it is important to handle censored
observations since reliability data are often of this type. If an
extended gamma prior is used, the posterior distribution for right
censored observations is also an extended gamma process. The proof is
given in section 7.

THEOREM 3.2 If the prior over the hazard rates is
$\Gamma(\alpha(\cdot), \beta(\cdot))$, then the posterior over the
hazard rates, given $m$ censored observations of the form
$X_1 \ge x_1, X_2 \ge x_2, \ldots, X_m \ge x_m$, is
$\Gamma(\alpha(\cdot), \hat\beta(\cdot))$ where

    (3.6)  $\hat\beta(t) = \dfrac{\beta(t)}{1 + \beta(t) \sum_{i=1}^{m} (x_i - t)^+}$.

The  effect  of  censored  observations  is  thus  to  lower  the  sample 
paths  to  the  left  of  the  censoring  points  while  leaving  the  values 
to  the  right  unchanged  which  appears  inherently  reasonable. 
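
The update in (3.6) is simple enough to state as a short function. The
sketch below is illustrative only; the name `censored_beta_hat` and
the numerical values are assumptions for the example.

```python
def censored_beta_hat(beta, censor_times):
    """Return beta_hat(t) = beta(t) / (1 + beta(t) * sum_i (x_i - t)^+), as in (3.6)."""
    def beta_hat(t):
        exposure = sum(max(x - t, 0.0) for x in censor_times)   # sum of (x_i - t)^+
        return beta(t) / (1.0 + beta(t) * exposure)
    return beta_hat

# Assumed example: beta(t) = 2 with censoring points at 1 and 3.
beta_hat = censored_beta_hat(lambda t: 2.0, [1.0, 3.0])
print(beta_hat(0.0), beta_hat(2.0), beta_hat(3.5))
```

In this example $\hat\beta$ is reduced to the left of the censoring
points and coincides with $\beta$ beyond the largest one.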

We next turn to the posterior distribution of $r(t)$ given exact
observations. The following theorem shows that the posterior can be
expressed as a continuous mixture of extended gamma distributions.
However, the dimension of the mixing measure increases with sample
size. The proof is given in section 7.

THEOREM 3.3 If the prior over the hazard rates is
$\Gamma(\alpha(\cdot), \beta(\cdot))$, then the posterior over the
hazard rates, given $m$ observations of the form
$X_1 = x_1, \ldots, X_m = x_m$, is a mixture of extended gamma
processes. The distribution of the mixture is given by

    (3.7)  $P(r(t) \in B \mid X_1 = x_1, \ldots, X_m = x_m)$

    $= \dfrac{\int_{[0,x_m)} \cdots \int_{[0,x_1)} \prod_{i=1}^{m} \hat\beta(z_i)\, F\left(B;\ \Gamma\left(\alpha + \sum_{i=1}^{m} I_{(z_i,\infty)},\ \hat\beta\right)\right) \prod_{i=1}^{m} d\left[\alpha + \sum_{j=i+1}^{m} I_{(z_j,\infty)}\right](z_i)}{\int_{[0,x_m)} \cdots \int_{[0,x_1)} \prod_{i=1}^{m} \hat\beta(z_i) \prod_{i=1}^{m} d\left[\alpha + \sum_{j=i+1}^{m} I_{(z_j,\infty)}\right](z_i)}$

Here $F(B; Q)$ denotes the probability assigned to
$B \in \mathcal{B}_R$ by a stochastic process which is distributed as
$Q$, $\hat\beta(\cdot)$ is defined as in (3.6), and the iterated
integrations are done first with respect to $z_1$, then $z_2$, through
$z_m$. Of course $\sum_{j=m+1}^{m} I_{(z_j,\infty)} \equiv 0$.


The complexity of this distribution makes it difficult to see how an
observation affects the posterior. Close examination reveals that a
failure at time $x_i$ indicates an increase in the hazard rate prior
to $x_i$. However, this increase in the hazard rate diminishes as one
looks further into the past. This is evidenced by the weight function
$\hat\beta(t) = \beta(t)\left[1 + \beta(t)(x_i - t)^+\right]^{-1}$ in
the mixing integral. The above effect is tempered by the rate at which
$\alpha(t)$ increases, so that $\beta(t)$ and $\alpha(t)$ together
determine where and how the increase in risk (the unit jump in the
$\alpha$ function) occurs.


4.  DECREASING  HAZARD  RATES 

With very little modification the work done for increasing hazard
rates can be applied toward decreasing hazard rates. In particular,
let $\alpha(\cdot)$, $\beta(\cdot)$, and $Z(\cdot)$ be defined as in
section 2 with one exception. We assume that $\alpha(\cdot)$ and
$\beta(\cdot)$ have finite values at plus infinity designated by
$\alpha(\infty)$ and $\beta(\infty)$. We require that
$\alpha(\infty) \ge \alpha(t)$, $t > 0$. $Z(\infty)$ is of course
$G(\alpha(\infty), 1)$, and $Z(\infty) - \lim_{t\to\infty} Z(t)$ is
independent of the rest of the process. We then define a decreasing
extended gamma process $D\Gamma(\alpha(\cdot), \beta(\cdot))$ by

    (4.1)  $r(t) = \int_{[t,\infty)} \beta(s)\,dZ(s) + \beta(\infty)\left[Z(\infty) - \lim_{t\to\infty} Z(t)\right] = \int_{[t,\infty]} \beta(s)\,dZ(s)$.

With this definition, $r(t)$ need not go to 0 as $t$ goes to
$\infty$. Integrals w.r.t. $\alpha(\cdot)$ are defined in an analogous
manner. We take $r(t)$ to have non-increasing left-continuous paths.
As expected,

    (4.2)  $E\,r(t) = \int_{[t,\infty]} \beta(s)\,d\alpha(s)$,

           $\mathrm{Var}\,r(t) = \int_{[t,\infty]} \beta^2(s)\,d\alpha(s)$, etc.


If one then uses a $D\Gamma(\alpha(\cdot), \beta(\cdot))$ prior over
the failure rates, essentially all the distributional results of
Section 3 carry over providing we replace "extended gamma" with
"decreasing extended gamma", define $\hat\beta(\cdot)$ differently,
and make our range of integration be $[t,\infty]$ rather than
$[0,t)$. The following theorems will be stated without proofs.

THEOREM 4.1 If the prior over the hazard rates is
$D\Gamma(\alpha(\cdot), \beta(\cdot))$, then the joint marginal
survival function of $n$ observations $X_1, \ldots, X_n$ is given by

    (4.3)  $\bar F(t_1, \ldots, t_n) = P(X_1 \ge t_1, \ldots, X_n \ge t_n) = \exp\left[-\int_{[0,\infty]} \ln\left(1 + \beta(s) \sum_{i=1}^{n} \min(t_i, s)\right) d\alpha(s)\right]$.

THEOREM 4.2 If the prior over the hazard rates is
$D\Gamma(\alpha(\cdot), \beta(\cdot))$, then the posterior of the
hazard rates given the $n$ censored observations
$X_1 \ge x_1, \ldots, X_n \ge x_n$ is
$D\Gamma(\alpha(\cdot), \hat\beta(\cdot))$ where

    (4.4)  $\hat\beta(t) = \dfrac{\beta(t)}{1 + \beta(t) \sum_{i=1}^{n} \min(x_i, t)}$.


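THEOREM 4.3 If the prior over the hazard rates is
$D\Gamma(\alpha(\cdot), \beta(\cdot))$, the posterior of the hazard
rates given $X_1 = x_1, \ldots, X_m = x_m$ can be expressed as a
continuous mixture of decreasing extended gamma distributions, i.e. as

    (4.5)  $P(r(t) \in B \mid X_1 = x_1, \ldots, X_m = x_m)$

    $= \dfrac{\int_{[x_m,\infty]} \cdots \int_{[x_1,\infty]} \prod_{i=1}^{m} \hat\beta(z_i)\, F\left(B;\ D\Gamma\left(\alpha + \sum_{i=1}^{m} I_{(z_i,\infty]},\ \hat\beta\right)\right) \prod_{i=1}^{m} d\left[\alpha + \sum_{j=i+1}^{m} I_{(z_j,\infty]}\right](z_i)}{\int_{[x_m,\infty]} \cdots \int_{[x_1,\infty]} \prod_{i=1}^{m} \hat\beta(z_i) \prod_{i=1}^{m} d\left[\alpha + \sum_{j=i+1}^{m} I_{(z_j,\infty]}\right](z_i)}$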

The  hazard  rate  estimation  discussion  in  Section  5 will  apply 
to  the  decreasing  case  provided  one  makes  the  obvious  changes  in 
the  various  expressions.  Similarly,  the  computational  results  in 
Section  6 can  easily  be  modified  to  handle  the  decreasing  hazard 
rate  situation. 

5.  BAYES  ESTIMATORS 

(a)  Estimation  of  hazard  rates. 

A natural loss function to be used when estimating a hazard rate is
the generalization of squared error loss given in Ferguson (1973).
Thus our loss function will be

    (5.1)  $L(\hat r, r) = \int_{[0,\infty)} (\hat r(t) - r(t))^2\,dW(t)$

where $W$ is an arbitrary finite measure on $[0,\infty)$ such that

    $\int_{[0,\infty)} \int_{[0,t)} \beta^2(s)\,d\alpha(s)\,dW(t) < \infty$.

In finding $\hat r(t)$ which minimizes the expected loss, we may
interchange the order of integration and thus minimize

    $E(\hat r(t) - r(t))^2$

for a fixed $t$. The Bayes estimator is given by the posterior mean
of $r(t)$.

If we ignore censored observations, we may use the form of $E\,r(t)$
in (2.3) and the fact that the mean of a mixture of distributions is
the mixture of the means (assuming existence) to express the Bayes
estimator of $r(t)$ as

    (5.2)  $\hat r(t) = \dfrac{\int_{[0,x_n)} \cdots \int_{[0,x_1)} \int_{[0,t)} \prod_{i=0}^{n} \hat\beta(z_i) \prod_{i=0}^{n} d\left[\alpha(z_i) + \sum_{j=i+1}^{n} I_{(z_j,\infty)}(z_i)\right]}{\int_{[0,x_n)} \cdots \int_{[0,x_1)} \prod_{i=1}^{n} \hat\beta(z_i) \prod_{i=1}^{n} d\left[\alpha(z_i) + \sum_{j=i+1}^{n} I_{(z_j,\infty)}(z_i)\right]}$

where the iterated integrals are integrated with respect to $z_0$,
$z_1$, $z_2$, etc., respectively.

Note that the denominator is of exactly the same form as the
numerator, though the integral is of a smaller dimension. Obviously
$\hat r(t)$ is a nondecreasing function of $t$ as expected. Including
censored observations would only modify the $\hat\beta$ function.

While some approaches to nonparametric hazard rate estimation require
the use of an arbitrary window function $w(\cdot)$ of integral one,
this approach is free of any such function.

It would appear that the utility of this estimate is severely limited
since it involves a multi-dimensional integral. We shall show in the
next section, however, that $\hat r(t)$ is expressible in a manner
that involves only one-dimensional integrals.

If the prime consideration is predictive in nature, the solution is
different. Suppose

    (5.3)  $F^*(t) = P(X_{n+1} \ge t \mid X_1 = x_1, \ldots, X_n = x_n)$

denotes the conditional survival function of a future observation
given $n$ current observations. Then

    (5.4)  $F^*(t) = E_{r(t) \mid x_1, \ldots, x_n}\, P(X_{n+1} \ge t \mid r(t), X_1 = x_1, \ldots, X_n = x_n)$

where the expectation is with respect to the posterior distribution
of $r$ given $X_1 = x_1, \ldots, X_n = x_n$. Since the $X_i$'s are
conditionally i.i.d., this is equivalent to

    (5.5)  $E_{r(t) \mid x_1, \ldots, x_n} \exp\left[-\int_{[0,t)} r(s)\,ds\right]$

which is the posterior mean at $t$ of the random survival function
$\bar F(t)$ defined from $r(t)$ by
$\bar F(t) = \exp\left[-\int_{[0,t)} r(s)\,ds\right]$. Thus $F^*(t)$
can be thought of as the Bayes estimate of the survival function for
the squared error loss function

    (5.6)  $L(\hat F, F) = \int_{[0,\infty)} (\hat F(t) - F(t))^2\,dW(t)$

where $W(t)$ is a finite measure over $[0,\infty)$. Including censored
observations only changes the form of the $\hat\beta$ function.

To find a closed form expression for $F^*(t)$, let

    (5.7)  $C = \int_{[0,x_n)} \cdots \int_{[0,x_1)} \prod_{i=1}^{n} \hat\beta(z_i) \prod_{i=1}^{n} d\left[\alpha(z_i) + \sum_{j=i+1}^{n} I_{(z_j,\infty)}(z_i)\right]$

denote the norming constant in the posterior of $r(t)$. Since the
posterior is a mixture of extended gamma distributions, we may use
Theorem 3.1 to obtain

    (5.8)  $F^*(t) = \dfrac{1}{C} \int_{[0,x_n)} \cdots \int_{[0,x_1)} \prod_{i=1}^{n} \hat\beta(z_i) \exp\left[-\int_{[0,\infty)} \ln\left(1 + \hat\beta(z_0)(t - z_0)^+\right) d\left(\alpha(z_0) + \sum_{i=1}^{n} I_{(z_i,\infty)}(z_0)\right)\right] \prod_{i=1}^{n} d\left[\alpha(z_i) + \sum_{j=i+1}^{n} I_{(z_j,\infty)}(z_i)\right]$.

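The integrand can be evaluated as

    (5.9)  $\prod_{i=1}^{n} \hat\beta(z_i) \exp\left[-\int_{[0,\infty)} \ln\left(1 + \hat\beta(z_0)(t - z_0)^+\right) d\alpha(z_0) - \sum_{i=1}^{n} \ln\left(1 + \hat\beta(z_i)(t - z_i)^+\right)\right]$

    $= \exp\left[-\int_{[0,\infty)} \ln\left(1 + \hat\beta(z_0)(t - z_0)^+\right) d\alpha(z_0)\right] \prod_{i=1}^{n} \dfrac{\hat\beta(z_i)}{1 + \hat\beta(z_i)(t - z_i)^+}$.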


Noting that our first factor is free of $z_1, \ldots, z_n$, and
relabeling

    (5.10)  $\beta^*(z_i) = \dfrac{\hat\beta(z_i)}{1 + \hat\beta(z_i)(t - z_i)^+}$,

we obtain

    (5.11)  $F^*(t) = \exp\left[-\int_{[0,\infty)} \ln\left(1 + \hat\beta(z_0)(t - z_0)^+\right) d\alpha(z_0)\right] \cdot \dfrac{\int_{[0,x_n)} \cdots \int_{[0,x_1)} \prod_{i=1}^{n} \beta^*(z_i) \prod_{i=1}^{n} d\left[\alpha(z_i) + \sum_{j=i+1}^{n} I_{(z_j,\infty)}(z_i)\right]}{\int_{[0,x_n)} \cdots \int_{[0,x_1)} \prod_{i=1}^{n} \hat\beta(z_i) \prod_{i=1}^{n} d\left[\alpha(z_i) + \sum_{j=i+1}^{n} I_{(z_j,\infty)}(z_i)\right]}$.


Similarly, Corollary 3.1 can be used to express the joint survival
function of $k$ future observations $X_{n+1}, \ldots, X_{n+k}$
conditional on the observed data. Thus

    (5.12)  $F^*(t_{n+1}, \ldots, t_{n+k}) = P(X_{n+1} \ge t_{n+1}, \ldots, X_{n+k} \ge t_{n+k} \mid X_1 = x_1, \ldots, X_n = x_n)$

is of the same form as (5.11) with $(t - z_i)^+$ replaced by
$\sum_{j=n+1}^{n+k} (t_j - z_i)^+$. One of the consequences of this is
that the minimum of $k$ future observations has the conditional
survival function given in (5.11) with $\hat\beta$ replaced by
$k\hat\beta$.

Noting in (5.11) that $\beta^*$ is a nonincreasing function of $t$
which is equal to $\hat\beta$ when $t = 0$ guarantees that $F^*(t)$ is
a bona fide survival function. The first factor of $F^*(t)$ in (5.11)
would be the survival function of a future observation were the
observations censored at $x_1, \ldots, x_n$ rather than observed. Thus
the second factor contains the information gained by observing
"deaths" rather than "losses" (see Kaplan and Meier (1958) for
elaboration on this terminology).

Note that $F^*(t)$ is differentiable. By using the product rule for
derivatives, interchanging derivatives and integrals, and
interchanging the order of integration, it can be shown that the
density corresponding to $F^*(t)$ is given by

    (5.13)  $f^*(t) = \exp\left[-\int_{[0,\infty)} \ln\left(1 + \hat\beta(z_0)(t - z_0)^+\right) d\alpha(z_0)\right] \cdot \dfrac{\int_{[0,x_n)} \cdots \int_{[0,x_1)} \int_{[0,t)} \prod_{i=0}^{n} \beta^*(z_i) \prod_{i=0}^{n} d\left[\alpha(z_i) + \sum_{j=i+1}^{n} I_{(z_j,\infty)}(z_i)\right]}{\int_{[0,x_n)} \cdots \int_{[0,x_1)} \prod_{i=1}^{n} \hat\beta(z_i) \prod_{i=1}^{n} d\left[\alpha(z_i) + \sum_{j=i+1}^{n} I_{(z_j,\infty)}(z_i)\right]}$.

Moreover, if we define the random density function by

    (5.14)  $f(t) = r(t) \exp\left[-\int_{[0,t)} r(s)\,ds\right] = -\dfrac{d}{dt} \bar F(t)$,

then by interchanging differentiation and integration over the
posterior distribution we have

    (5.15)  $f^*(t) = -\dfrac{d}{dt} E(\bar F(t)) = E\left(-\dfrac{d}{dt} \bar F(t)\right) = E(f(t))$

so that $f^*(t)$ is the Bayes estimate of the density with the usual
type loss function,

    (5.16)  $L(\hat f, f) = \int_{[0,\infty)} (\hat f(t) - f(t))^2\,dW(t)$.

This suggests an approach to density estimation which gives smooth
continuous estimates and avoids the problem of defining window
functions as in Rosenblatt (1971).

Finally, we may obtain the failure rate $r^*(t)$ corresponding to
$F^*(t)$ as

    (5.17)  $r^*(t) = \dfrac{f^*(t)}{F^*(t)} = \dfrac{\int_{[0,x_n)} \cdots \int_{[0,x_1)} \int_{[0,t)} \prod_{i=0}^{n} \beta^*(z_i) \prod_{i=0}^{n} d\left[\alpha(z_i) + \sum_{j=i+1}^{n} I_{(z_j,\infty)}(z_i)\right]}{\int_{[0,x_n)} \cdots \int_{[0,x_1)} \prod_{i=1}^{n} \beta^*(z_i) \prod_{i=1}^{n} d\left[\alpha(z_i) + \sum_{j=i+1}^{n} I_{(z_j,\infty)}(z_i)\right]}$.


However, this is the same expression as in (5.2) with the exception
that the $\hat\beta$'s are replaced by $\beta^*$'s. In other words,
the effect of using the loss function over the c.d.f.'s (5.6) when
estimating a distribution (be it c.d.f., density, or hazard rate) at a
point $t$ rather than the loss function over the hazard rate (5.1) is
merely to act as though one has an additional censored observation at
the point $t$.

If one is interested in estimating the mean of the distribution in
question, then

    (5.18)  $\mu = \int_{[0,\infty)} \bar F(t)\,dt = \int_{[0,\infty)} \exp\left[-\int_{[0,t)} r(s)\,ds\right] dt$

is a well defined random variable providing

    $\int_{[0,\infty)} \exp\left[-\int_{[0,t)} \ln(1 + \beta(s)(t - s))\,d\alpha(s)\right] dt < \infty$.

Taking expectations with respect to the posterior distribution, the
mean of the estimated survival function $F^*$,

    (5.19)  $\mu^* = \int_{[0,\infty)} F^*(t)\,dt = \int_{[0,\infty)} E(\bar F(t))\,dt = E(\mu)$

is the Bayes estimate of $\mu$ under the usual loss function

6.  COMPUTATION  AND  SIMULATION.

The presence of the multi-dimensional integral which occurs in our
estimates would appear to make computation extremely difficult. The
following theorem enables us to work with integrals of only one
dimension. The integrands are powers of the $\hat\beta$ function and
the integration is with respect to the $\alpha$ measure.

THEOREM 6.1 Assuming that $\alpha(\cdot)$ and $\hat\beta(\cdot)$ are
defined as in Section 2, then

    (6.1)  $\int_{[0,x_n)} \cdots \int_{[0,x_1)} \prod_{i=1}^{n} \hat\beta(z_i) \prod_{i=1}^{n} d\left[\alpha(z_i) + \sum_{j=i+1}^{n} I_{(z_j,\infty)}(z_i)\right] = \sum_{e} k(e) \left[\prod_{\{i;\ e_i \ge 1\}} \int_{[0,x_i)} \hat\beta(t)^{e_i}\,d\alpha(t)\right]$

where $0 < x_n \le x_{n-1} \le \cdots \le x_1 < \infty$, the sum is
over all vectors $e = (e_1, \ldots, e_n)$ of non-negative integers
such that

    (6.2)  $\sum_{i=1}^{j} e_i \le j$, $j = 1, \ldots, n-1$;  $\sum_{i=1}^{n} e_i = n$;  and

    $k(e) = \prod_{\{j;\ e_j \ge 2\}} {}_{[(j-1) - \sum_{i=1}^{j-1} e_i]}P_{e_j - 1} = \prod_{\{j;\ e_j \ge 2\}} \dfrac{\left[(j-1) - \sum_{i=1}^{j-1} e_i\right]!}{\left[j - \sum_{i=1}^{j} e_i\right]!}$

where ${}_nP_r$ denotes the number of permutations of $n$ things taken
$r$ at a time.

PROOF: Since the inside integral can be expressed as a sum of $n$
integrals, the next as a sum of $n-1$ integrals, etc., it is clear
that (6.1) can be expressed as a sum of $n!$ integrals. Moreover,
since we assume the $x_i$'s are ordered in decreasing fashion, it
must be the case that

    (6.3)  $\int_{[0,x_i)} \hat\beta(z_i)\,dI_{(z_j,\infty)}(z_i) = \hat\beta(z_j)$  for $z_j < x_j \le x_i$, $j > i$.

This will then combine with $\hat\beta(z_j)^k$ to give
$\hat\beta(z_j)^{k+1}$ for some integer $k$. Close scrutiny will
reveal, however, that the exponent $k$ of $\hat\beta(z_j)^k$ can never
exceed $j$. Moreover, each of the $n!$ integrals will be of the form

    (6.4)  $\prod_{\{i;\ e_i \ge 1\}} \int_{[0,x_i)} \hat\beta(t)^{e_i}\,d\alpha(t)$

where $\sum_{i=1}^{n} e_i = n$, the $e_i$ being non-negative
integers. Thus to establish Theorem 6.1, we need only argue that
$k(e)$ correctly counts the number of terms of the form given in
(6.4).

Consider a vector $e = (e_1, \ldots, e_n)$ of the form specified in
the statement of the theorem. Fix $j$ and assume that $e_j \ge 1$.
Then one unit of the exponent of $\hat\beta(z_j)^{e_j}$ must come from
the $j$-th integration, and the other $e_j - 1$ units must come from
previous integrations. Since there will be
$(j-1) - \sum_{i=1}^{j-1} e_i$ previous integrations unaccounted for,
there are

    (6.5)  $\dbinom{(j-1) - \sum_{i=1}^{j-1} e_i}{e_j - 1}$

ways of choosing the required $e_j - 1$ integrations. Moreover, the
first chosen integration can increase the exponent in $e_j - 1$ ways
(by being routed to any of the other integrations which eventually
contribute to $e_j$), the second chosen integration can increase the
exponent in $e_j - 2$ ways, etc. Thus we need to multiply (6.5) by
$(e_j - 1)!$ to count how many ways we can obtain the necessary
exponent. Using the multiplication principle then to count the total
number of terms (6.4) for a given vector $e$ gives us $k(e)$.
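
The counting argument can be checked by brute force for small $n$.
The sketch below is an illustration rather than part of the proof: it
enumerates the vectors $e$ satisfying (6.2), evaluates $k(e)$ from the
factorial form in Theorem 6.1, and confirms that the counts add up to
the $n!$ terms mentioned above. The function names are hypothetical.
Grouping the counts by the number of nonzero components of $e$ for
$n = 4$ can be compared with the coefficients of
$z(z+1)(z+2)(z+3) = z^4 + 6z^3 + 11z^2 + 6z$.

```python
from math import factorial

def admissible_vectors(n):
    """Yield e = (e_1, ..., e_n) with partial sums <= j for j < n and total n, as in (6.2)."""
    def extend(prefix, used):
        j = len(prefix) + 1
        if j == n:
            yield prefix + (n - used,)          # last component is forced by the total
            return
        for e_j in range(j - used + 1):         # keeps e_1 + ... + e_j <= j
            yield from extend(prefix + (e_j,), used + e_j)
    yield from extend((), 0)

def k(e):
    """Number of terms of the form (6.4) for the vector e (product over j with e_j >= 2)."""
    count, partial = 1, 0                       # partial = e_1 + ... + e_{j-1}
    for j, e_j in enumerate(e, start=1):
        if e_j >= 2:
            count *= factorial(j - 1 - partial) // factorial(j - partial - e_j)
        partial += e_j
    return count

n = 4
total = sum(k(e) for e in admissible_vectors(n))
by_support = {}
for e in admissible_vectors(n):
    size = sum(1 for e_j in e if e_j > 0)       # number of nonzero components of e
    by_support[size] = by_support.get(size, 0) + k(e)
print(total, by_support)
```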


Consider the very specialized case where $\alpha(\cdot)$ jumps at 0
and is then flat. That is,

    $\alpha(0) = 0$,
    $\alpha(x) = \alpha$,  $x > 0$.

In this case, $r(t)$ will be a constant function whose value will be
a $G(\alpha, \beta(0))$ random variable. Thus the only value of
$\beta(t)$ that matters is $\beta(0) = \beta$. Since the parameter in
an exponential distribution is just its constant failure rate, this is
equivalent to putting a $G(\alpha, \beta)$ prior over the parameter
$\theta$ of an exponential density.

Then if we have complete observations at $x_1, \ldots, x_n$ and
censored observations at $x_{n+1}, \ldots, x_{n+m}$, we can specify
our posterior distribution of $r(t)$ from Theorems 3.2 and 3.3. Since
the posterior of an exponential distribution with a gamma prior is
again a gamma, the distribution of $r(t_0)$, $t_0 > 0$, specified by
the mixture in Theorem 3.3 must also be a gamma distribution.

In this event, the Bayes estimate of $r(t)$, $t > 0$, is a constant
(free of $t$) and may be expressed in terms of Theorem 6.1.

Let $\#e$ denote the number of non-zero components of $e$. Then from
Theorem 6.1, the numerator of $\hat r(t)$ equals

    $\hat\beta(0)^{n+1} \sum_{e} k(e)\,\alpha^{\#e} = \hat\beta(0)^{n+1} \sum_{i=1}^{n+1} \alpha^i \sum_{\{e;\ \#e = i\}} k(e)$.

However, it can be shown that $\sum_{\{e;\ \#e = i\}} k(e)$ is the
coefficient of $Z^i$ in $Z(Z+1)(Z+2)\cdots(Z+n)$ (the modulus of
Stirling numbers of the first kind). Thus the numerator of
$\hat r(t)$ equals

    $\left[\dfrac{\beta}{1 + \beta \sum_{i=1}^{n+m} x_i}\right]^{n+1} \alpha(\alpha+1)\cdots(\alpha+n)$.

By similar treatment, the denominator of $\hat r(t)$ equals

    $\left[\dfrac{\beta}{1 + \beta \sum_{i=1}^{n+m} x_i}\right]^{n} \alpha(\alpha+1)\cdots(\alpha+n-1)$.

Thus

    $\hat r(t) = \dfrac{\beta(\alpha+n)}{1 + \beta \sum_{i=1}^{n+m} x_i} = \dfrac{\alpha+n}{\beta^{-1} + \sum_{i=1}^{n+m} x_i}$,  $t > 0$.

This agrees with the posterior mean for uncensored data given in
Mann, Schafer, and Singpurwalla (1974) (see page 414). As one would
expect, as $n \to \infty$,

    $\hat r(t) \sim$ [total no. of failures] / [total time on test].
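
For a quick numerical reading of the constant hazard case, the
following sketch evaluates the posterior mean
$(\alpha + n)/(\beta^{-1} + \sum x_i)$. The function name and the data
are made up for illustration; $\beta$ is treated as a scale parameter,
as in the definition of $G(\alpha, \beta)$ in Section 2.

```python
def constant_hazard_estimate(alpha, beta, failures, censored=()):
    """Posterior mean of a constant hazard under a G(alpha, beta) prior (beta is a scale)."""
    total_time = sum(failures) + sum(censored)      # total time on test
    return (alpha + len(failures)) / (1.0 / beta + total_time)

# Illustrative data: two failures and one censored lifetime.
estimate = constant_hazard_estimate(2.0, 0.5, [1.0, 2.0], [3.0])
print(estimate)
```

With many failures the prior terms $\alpha$ and $\beta^{-1}$ become
negligible and the estimate approaches the ratio of failures to total
time on test.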
In order to observe the performance of our Bayes estimators, samples from Weibull and exponential distributions were taken and the corresponding Bayes estimators computed. In all cases the sample size was 11 and the prior parameter functions $\alpha(t) = t$ and $\beta(t) \equiv 2$ were used. Thus the expected value of the prior hazard rate would be $\int_{[0,t)} \beta(s)\, d\alpha(s) = 2t$. This is the hazard rate of a Weibull distribution with mean .8862. All observations were complete (not censored). It is true that if one decreases $\beta(\cdot)$ and increases $\alpha(\cdot)$ in such a way that the mean of the prior $\int_{[0,t)} \beta(s)\, d\alpha(s)$ is unchanged, the variance of the prior $\int_{[0,t)} \beta^2(s)\, d\alpha(s)$ will be decreased. This has the effect of specifying a more precise prior distribution and hence the prior will have more influence in posterior estimates.
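The two numerical claims in this paragraph can be illustrated with a short sketch. It assumes the family $\alpha(s) = c\,s$, $\beta(s) \equiv b$ (the constants `c` and `b` are illustrative names; the simulations in the text use $c = 1$, $b = 2$), for which the prior mean and variance integrals reduce to $bct$ and $b^2ct$.

```python
import math


def prior_moments(c, b, t):
    """Prior mean and variance of r(t) when alpha(s) = c*s and beta(s) = b."""
    mean = b * c * t          # integral of beta d(alpha) over [0, t)
    variance = b ** 2 * c * t  # integral of beta^2 d(alpha) over [0, t)
    return mean, variance


# Hazard rate 2t gives survival function exp(-t^2), a Weibull with shape 2
# and unit scale, whose mean is Gamma(3/2).
weibull_mean = math.gamma(1.5)

# Same prior mean 2t at t = 1, but a smaller beta and larger alpha.
moments_used_in_text = prior_moments(c=1, b=2.0, t=1.0)
moments_tighter_prior = prior_moments(c=4, b=0.5, t=1.0)
```

Holding $bc = 2$ fixed, the variance $b^2 c t = 2bt$ shrinks in proportion to $b$, which is the sense in which a smaller $\beta(\cdot)$ specifies a more precise prior.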

Bayes estimates are computed under both loss functions, i.e., squared error loss on hazard rates and on c.d.f.'s. The hazard rate corresponding to the Bayes estimate of the c.d.f. is graphed along with the estimated hazard rate on the hazard rate graphs for the purpose of comparison. Similarly, the c.d.f. corresponding to the Bayes estimate of the hazard rate is graphed on the c.d.f. graphs.

Thus on Figures 1-6, the posterior Bayes estimate of the hazard rate is denoted by a solid line, while the hazard rate which corresponds to the posterior Bayes estimate of the c.d.f. is denoted by the line made up of alternate dashes and plusses. Since the key is the same for all graphs, it is stated explicitly only in Figure 1. A similar interpretation holds for Figures 7-12 concerning the c.d.f.'s. Thus the distributions corresponding to the solid (plus-dash) lines in Figures 1-6 are the same as the distributions corresponding to the alternating plus-dash (solid) lines in Figures 7-12 respectively.

Figures 1, 3, and 5 depict the Bayes estimates of the hazard rates when the random samples come from Weibull distributions whose failure rates are respectively $t$, $2t$, and $3t$. Note that the estimates reflect the populations from which the samples come by generally having progressively steeper slopes. (Note that the scales change between graphs so that visual slopes are deceptive.) Figures 2, 4, and 6 depict the Bayes estimates of the hazard rates when the samples come from exponential distributions. In each case the mean of the exponential is made to be the same as that of the Weibull distribution in the preceding figure.

The purpose of this is to see if our estimated hazard rates will reflect the difference between Weibull and exponential distributions. Note that in each case, the estimated hazard rates are flatter for the exponential distributions than for the Weibull distributions. Of course, since exponential distributions are on the boundary (our prior puts all the probability on nondecreasing hazard rates), our estimates of the hazard rate will necessarily be increasing to some degree. Figures 7-12 give the estimates in terms of c.d.f.'s rather than hazard rates. They are of course based on the same samples used in Figures 1-6. Finally, Figures 13 and 14 depict estimates of the hazard rate and c.d.f. for the data given in the Kaplan and Meier (1958) paper. The prior was arbitrarily taken to be $\alpha(t) = t$, $\beta(t) \equiv 4$, and the starred lines indicate censored values. Note that a slight peaking occurs in estimates of the hazard rate at complete observations, although this peaking is scarcely detectable in the c.d.f.'s. In comparing these graphs with the estimates of the c.d.f. given in the papers by Susarla and Van Ryzin and by Ferguson and Phadia, it appears that our estimate is closer to the Kaplan-Meier product limit estimate than theirs. We feel that our continuous estimates are more appealing than their discontinuous estimates.

In  conclusion,  it  appears  that  our  estimates  have  the  property 
of  being  smooth  and  continuous  and  yet  are  very  responsive  to  the 
data. 
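The sampling design used for Figures 1-12 (Weibull samples with failure rates $t$, $2t$, $3t$, each paired with an exponential sample of the same mean, sample size 11) can be sketched as follows. This is only an illustration under stated assumptions: the function names and the seed are hypothetical, and the report does not say how its samples were generated. Inversion of the cumulative hazard $H(t) = ct^2/2$ is used here.

```python
import math
import random


def linear_hazard_mean(c):
    """Mean of the distribution with failure rate r(t) = c*t, i.e. sqrt(pi / (2c))."""
    return math.sqrt(math.pi / (2.0 * c))


def linear_hazard_sample(c, size, rng):
    """Draw from failure rate c*t by solving c*x^2/2 = E with E ~ Exp(1)."""
    return [math.sqrt(2.0 * rng.expovariate(1.0) / c) for _ in range(size)]


def matched_exponential_sample(c, size, rng):
    """Exponential sample whose mean equals that of the failure-rate-c*t Weibull."""
    rate = 1.0 / linear_hazard_mean(c)
    return [rng.expovariate(rate) for _ in range(size)]


rng = random.Random(0)  # arbitrary seed, chosen only for reproducibility
samples = {
    c: (linear_hazard_sample(c, 11, rng), matched_exponential_sample(c, 11, rng))
    for c in (1, 2, 3)
}
```

For $c = 2$ the common mean is $\sqrt{\pi}/2 \approx .8862$, matching the prior mean hazard rate $2t$ used in the simulations.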


7.  PROOFS  OF  THEOREMS 


In this section we take our stochastic processes to be defined on an arbitrary probability space $(\Omega, \mathcal{F}, P)$. We use $R^R$ to denote the set of all nonnegative functions on the nonnegative real line $R$ and $\mathcal{B}^R$ to denote the smallest $\sigma$-algebra generated by sets of the form $\{x(\cdot) \in R^R : x(t_1) \in I_1, \ldots, x(t_k) \in I_k\}$ where $I_1, \ldots, I_k$ are intervals in $R$. A stochastic process $r$ is a measurable function which maps $\Omega$ into $R^R$. This induces a probability measure on $(R^R, \mathcal{B}^R)$ called the distribution of the process $r$. Since, with probability one, the sample paths $r(t,\omega)$ of our stochastic process are failure rates, we can define a probability measure $P$ on the product space $(R^R \times R, \mathcal{B}^R \times \mathcal{B})$ by extending $P(B \times C) = \int_A F_\omega(C)\, dP(\omega)$ to the usual product $\sigma$-algebra of $\mathcal{B}^R$ and the Borel sets $\mathcal{B}$. Here $A = r^{-1}(B)$ and $F_\omega(C)$ is the probability assigned to $C$ by the distribution corresponding to $r(\cdot,\omega)$. Then a probability measure on $(R, \mathcal{B})$ is determined by $P(C) = P(R^R \times C) = \int_\Omega F_\omega(C)\, dP(\omega)$ for all $C \in \mathcal{B}$. The posterior distribution of the process for a single observation is a function $\phi(\cdot,\cdot) : \mathcal{B}^R \times R \to [0,1]$, Borel measurable in the second argument when the first argument is fixed, such that for each fixed $x \in R$, $\phi(\cdot,x)$ is a probability measure on $(R^R, \mathcal{B}^R)$ and $\int_C \phi(B,x)\, dP(x) = P(B \times C)$ for all $B \in \mathcal{B}^R$ and $C \in \mathcal{B}$.

The extension for several observations is straightforward.

For convenience in writing we adopt the following notation:

(i) $g(x; \alpha, \beta) = x^{\alpha-1} \exp(-x/\beta)\, I_{[0,\infty)}(x) / (\Gamma(\alpha)\beta^{\alpha})$; $g(x; \alpha) = g(x; \alpha, 1)$.

(ii) $\Delta\alpha_i = \alpha(t_i^{(n)}) - \alpha(t_{i-1}^{(n)})$, $i = 1, \ldots, k(n)$.

(iii) $\beta_i = \beta(t_i^{(n)})$, $i = 1, \ldots, k(n)$.

(iv) $\Sigma = \sum_{i=1}^{k(n)}$ and $\Pi = \prod_{i=1}^{k(n)}$.

(v) $B_n(u, \beta, \tau, y) = \{(u_1, \ldots, u_{k(n)}) \in R^{k(n)} : \sum_{t_i < \tau_1} \beta_i u_i > y_1,\ \sum_{\tau_1 \le t_i < \tau_2} \beta_i u_i > y_2,\ \ldots,\ \sum_{\tau_{k-1} \le t_i < \tau_k} \beta_i u_i > y_k\}$. Often $B_n(u, \beta, \tau, y)$ is abbreviated $B_n(u, \beta)$.

(vi) $F(B; Q)$ = probability assigned to $B \in \mathcal{B}^R$ by a stochastic process with distribution $Q$.


LEMMA 7.1

Let $\alpha(\cdot)$ be a nonnegative nondecreasing left continuous function on $[0, \infty)$ with $\alpha(0) = 0$. For a sequence of partitions $0 = t_0^{(n)} < t_1^{(n)} < \cdots < t_{k(n)}^{(n)} < \infty$ whose norm goes to zero and whose upper end point goes to infinity, define $\alpha_n(0) = 0$ and

$$\alpha_n(t) = \sum_{i=1}^{k(n)} \alpha(t_i)\, I_{(t_{i-1},\, t_i]}(t) + \alpha(t_{k(n)})\, I_{(t_{k(n)},\, \infty)}(t), \qquad t \in (0, \infty).$$

If we define $B$ and $r_n(t,\omega)$ as in (7.6) and (2.5) respectively, $A = r^{-1}(B)$, and $A_n = r_n^{-1}(B)$, then

$$\int_\Omega I_{A_n}(\omega)\, dP(\omega) \to \int_\Omega I_A(\omega)\, dP(\omega), \tag{7.1}$$

i.e. $F(B; \Gamma(\alpha_n, \beta)) \to F(B; \Gamma(\alpha, \beta))$, and

$$\lim_{n\to\infty} \int_{B_n(u,\beta)} \Pi\, g(u_i; \Delta\alpha_i)\, du_i = F(B; \Gamma(\alpha, \beta)). \tag{7.2}$$

PROOF: For almost all $\omega$, $r_n(\tau,\omega) \to r(\tau,\omega)$ uniformly on $[0, t]$, $0 < t < \infty$, since $r_n(\tau,\omega)$ and $r(\tau,\omega)$ are almost surely nondecreasing left continuous bounded functions on $\tau \in [0, t]$. Thus $I_{A_n}(\omega) \to I_A(\omega)$ and (7.1) follows by LDCT. Note that

$$\int_\Omega I_{A_n}(\omega)\, dP(\omega) = F(B; \Gamma(\alpha_n, \beta)) = \int_{B_n(u,\beta)} \Pi\, g(u_i; \Delta\alpha_i)\, du_i. \tag{7.3}$$

Thus the l.h.s. of (7.2) exists and is equal to $F(B; \Gamma(\alpha, \beta))$.


PROOF OF THEOREM 3.1: Define $r_n(\tau)$ as in (2.5). As noted before, $Z(t,\omega)$ is nondecreasing and left continuous for almost all $\omega$. Hence $r(\tau)$, $\tau \in [0,t]$, is bounded almost surely and $r_n(\tau) \to r(\tau)$ for each $\tau \in [0,t]$. Thus

$$\int_{[0,t)} r_n(\tau)\, d\tau \to \int_{[0,t)} r(\tau)\, d\tau \quad \text{(by LDCT)}$$

for almost all $\omega \in \Omega$.

Now,

$$P(X > t) = \int_\Omega P(X > t \mid r(\cdot,\omega))\, dP(\omega)$$

$$= \int_\Omega \exp\Big[-\int_{[0,t)} r(\tau)\, d\tau\Big]\, dP(\omega)$$

$$= \lim_{n\to\infty} \int_\Omega \exp\Big[-\int_{[0,t)} r_n(\tau)\, d\tau\Big]\, dP(\omega) \quad \text{by LDCT} \tag{7.4}$$

$$= \lim_{n\to\infty} \int_\Omega \exp\big[-\Sigma\, (t - t_i)^+ \beta_i \{Z(t_i) - Z(t_{i-1})\}\big]\, dP(\omega)$$

$$= \lim_{n\to\infty} \int_{R^{k(n)}} \exp\big[-\Sigma\, (t - t_i)^+ \beta_i u_i\big]\, \Pi\, g(u_i; \Delta\alpha_i)\, du_i$$

$$= \lim_{n\to\infty} \Pi\, \big(1 + (t - t_i)^+ \beta_i\big)^{-\Delta\alpha_i}$$

$$= \exp\Big[-\int_{[0,t)} \ln\big(1 + \beta(s)(t-s)\big)\, d\alpha(s)\Big]. \tag{7.5}$$

PROOF OF THEOREM 3.2: First consider the case $m = 1$. Define $B \in \mathcal{B}^R$ by

$$B = \{r(\cdot) \in R^R : r(\tau_1) > y_1,\ r(\tau_2) - r(\tau_1) > y_2,\ \ldots,\ r(\tau_k) - r(\tau_{k-1}) > y_k\} \tag{7.6}$$

where $k$ is an arbitrary positive integer and $\tau_1 < \cdots < \tau_k$, $y_1, \ldots, y_k$ are arbitrary nonnegative real numbers. It can be shown that the distribution of the process $r(t,\omega)$ is uniquely determined by the probabilities given to sets of the form $A = r^{-1}(B)$. Thus it suffices to show that the posterior probability of sets of the form $A = r^{-1}(B)$ equals that assigned by an extended gamma process with parameter functions $\alpha(\cdot)$ and $\hat\beta(\cdot)$.

Defining $r_n(t)$ as in (2.5) and $A_n = r_n^{-1}(B)$, then $r_n(t) \to r(t)$ a.s. and $I_{A_n} \to I_A$ a.s. Thus

$$P(r(t) \in B \mid X \ge x) = P(r(t) \in B,\ X \ge x) / P(X \ge x)$$

$$= \int_A \exp\Big[-\int_{[0,x)} r(t)\, dt\Big]\, dP(\omega) \Big/ \int_\Omega \exp\Big[-\int_{[0,x)} r(t)\, dt\Big]\, dP(\omega)$$

$$= \lim_{n\to\infty} \int_\Omega \exp\Big[-\int_{[0,x)} r_n(t)\, dt\Big] I_{A_n}(\omega)\, dP(\omega) \Big/ \lim_{n\to\infty} \int_\Omega \exp\Big[-\int_{[0,x)} r_n(t)\, dt\Big]\, dP(\omega)$$

$$= \lim_{n\to\infty} \int_{B_n(u,\beta)} \exp\big[-\Sigma\, \beta_i (x - t_i)^+ u_i\big]\, \Pi\, g(u_i; \Delta\alpha_i)\, du_i \Big/ \lim_{n\to\infty} \Pi\, [1 + \beta_i (x - t_i)^+]^{-\Delta\alpha_i} \quad \text{by (7.4)}$$

$$= \lim_{n\to\infty} \int_{B_n(u,\beta)} \Pi\, g\big(u_i; \Delta\alpha_i, [1 + \beta_i (x - t_i)^+]^{-1}\big)\, du_i$$

$$= \lim_{n\to\infty} \int_{B_n(v,\hat\beta)} \Pi\, g(v_i; \Delta\alpha_i)\, dv_i \quad \text{where } \beta_i u_i = \hat\beta_i v_i$$

$$= F(B; \Gamma(\alpha, \hat\beta)) \quad \text{by (7.2) of Lemma 7.1.}$$

This proves the theorem for $m = 1$. Using a parenthesized subscript to emphasize explicitly the dependence of $\hat\beta$ on the sample size, we have $\hat\beta_{(j)}(t) / [1 + \hat\beta_{(j)}(t)(x_{j+1} - t)^+] = \hat\beta_{(j+1)}(t)$. The theorem follows by induction.
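The induction step can be made concrete with a small sketch. Starting from $\hat\beta_{(0)} = \beta$ (an assumption about the base case), the recursion above is equivalent to $1/\hat\beta_{(j+1)}(t) = 1/\hat\beta_{(j)}(t) + (x_{j+1} - t)^+$, so after $m$ censored observations one would expect $\hat\beta(t) = \beta(t)/[1 + \beta(t)\sum_i (x_i - t)^+]$. The function names and data below are illustrative.

```python
def beta_hat_recursive(beta, t, censored):
    """Apply beta_(j+1)(t) = beta_(j)(t) / (1 + beta_(j)(t) * (x_(j+1) - t)^+)
    once per censored observation, starting from the prior beta(t)."""
    b = beta(t)
    for x in censored:
        b = b / (1.0 + b * max(x - t, 0.0))
    return b


def beta_hat_direct(beta, t, censored):
    """Closed form implied by the recursion: beta / (1 + beta * sum (x_i - t)^+)."""
    return beta(t) / (1.0 + beta(t) * sum(max(x - t, 0.0) for x in censored))


# Illustrative censoring times and a constant prior beta = 2.
censoring_times = [0.5, 1.5, 3.0]
recursive_value = beta_hat_recursive(lambda s: 2.0, 1.0, censoring_times)
direct_value = beta_hat_direct(lambda s: 2.0, 1.0, censoring_times)
```

Observations censored before $t$ leave $\hat\beta(t)$ unchanged, since $(x_i - t)^+ = 0$ for them; only items still on test at time $t$ shrink the scale function there.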

PROOF OF THEOREM 3.3: First consider the case $m = 1$. It suffices to consider sets of the form $B$ where $B$ is defined by (7.6) and show that

$$\int_{[x,\infty)} \phi(B,x)\, f(x)\, dx = P(r(t) \in B,\ X \ge x) \tag{7.7}$$

where

$$f(x) = -\frac{d}{dx} \int_\Omega \exp\Big[-\int_{[0,x)} r(s)\, ds\Big]\, dP(\omega) \tag{7.8}$$

is the marginal density of $X$ and

$$\phi(B,x) = \int_{[0,x)} \hat\beta(s)\, F\big(B; \Gamma(\alpha + I_{(s,\infty)}, \hat\beta)\big)\, d\alpha(s) \Big/ \int_{[0,x)} \hat\beta(s)\, d\alpha(s) \tag{7.9}$$

is the conditional distribution of the process. To show (7.7) we define $\phi_n(B,x)$ and $f_n(x)$ below by (7.11) and (7.12) respectively and prove the following series of claims:

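Claim 1:

$$\int_{[x,\infty)} \phi_n(B,x)\, f_n(x)\, dx = \int_\Omega \exp\Big[-\int_{[0,x)} r_n(s)\, ds\Big] I_{A_n}(\omega)\, dP(\omega) \tag{7.10}$$

where $r_n$ is defined by (2.5) and $A_n = r_n^{-1}(B)$.

Claim 2: The r.h.s. of (7.10) converges to $P(r(t) \in B,\ X \ge x)$ as $n \to \infty$.

Claim 3: The l.h.s. of (7.10) converges to $\int_{[x,\infty)} \phi(B,x)\, f(x)\, dx$ as $n \to \infty$.

We define

$$\phi_n(B,x) = \frac{d}{dx} \int_\Omega \exp\Big[-\int_{[0,x)} r_n(s)\, ds\Big] I_{A_n}(\omega)\, dP(\omega) \Big/ \frac{d}{dx} \int_\Omega \exp\Big[-\int_{[0,x)} r_n(s)\, ds\Big]\, dP(\omega) \tag{7.11}$$

and

$$f_n(x) = -\frac{d}{dx} \int_\Omega \exp\Big[-\int_{[0,x)} r_n(s)\, ds\Big]\, dP(\omega). \tag{7.12}$$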


Claim  1 is  a direct  consequence  of  the  above  definitions. 

Claim 2 follows since $r_n(s) \to r(s)$ a.s., $I_{A_n}(\omega) \to I_A(\omega)$ a.s., and by LDCT the r.h.s. of (7.10) converges to $\int_\Omega \exp[-\int_{[0,x)} r(s)\, ds]\, I_A(\omega)\, dP(\omega) = P(r(t) \in B,\ X \ge x)$. To prove claim 3 we show below that (i) $f_n(x) \to f(x)$, (ii) $\phi_n(B,x) f_n(x) \to \phi(B,x) f(x)$, observe from (7.13) and (7.15) that $0 \le \phi_n(B,x) f_n(x) \le f_n(x)$ and note from the definitions (7.8) and

(7.12) that $\int_{[x,\infty)} f_n(x)\, dx \to \int_{[x,\infty)} f(x)\, dx$. Thus claim 3 follows by a generalization of the dominated convergence theorem.
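(i) To show $f_n(x) \to f(x)$, we use (7.4) to write

$$f_n(x) = -\frac{d}{dx}\Big[\Pi\, \big(1 + \beta_i (x - t_i)^+\big)^{-\Delta\alpha_i}\Big]$$

$$= \sum_{i=1}^{k(n)} \Big[\Delta\alpha_i \big(1 + \beta_i (x - t_i)^+\big)^{-\Delta\alpha_i - 1} \beta_i\, I_{[0,x)}(t_i) \prod_{j \ne i} \big(1 + \beta_j (x - t_j)^+\big)^{-\Delta\alpha_j}\Big]$$

$$= \Big[\Pi\, \big(1 + \beta_j (x - t_j)^+\big)^{-\Delta\alpha_j}\Big]\, \Sigma\, \hat\beta_i\, I_{[0,x)}(t_i)\, \Delta\alpha_i$$

$$\to \exp\Big[-\int_{[0,\infty)} \ln\big(1 + \beta(t)(x-t)^+\big)\, d\alpha(t)\Big] \int_{[0,x)} \hat\beta(t)\, d\alpha(t)$$

$$= -\frac{d}{dx} \exp\Big[-\int_{[0,\infty)} \ln\big(1 + \beta(t)(x-t)^+\big)\, d\alpha(t)\Big]$$

$$= f(x) \quad \text{by (7.8) and Theorem 3.1.} \tag{7.13}$$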
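The identity behind (7.13), that the marginal density equals the survival function times $\int_{[0,x)} \hat\beta(t)\, d\alpha(t)$, can be checked numerically for a specific prior. The sketch below assumes $\alpha(t) = t$, $\beta(t) \equiv 2$ and a single observation at $x$, so that $\hat\beta(t) = 2/(1 + 2(x-t))$ on $[0,x)$; function names, step counts, and the evaluation point are illustrative choices.

```python
import math


def survival(x):
    """exp[-int_0^x ln(1 + 2 (x - t)) dt] for alpha(t) = t, beta = 2."""
    return math.exp(-((1.0 + 2.0 * x) * math.log(1.0 + 2.0 * x) / 2.0 - x))


def density_from_limit(x, steps=20000):
    """survival(x) * int_0^x beta_hat(t) dt, with the integral by the midpoint rule."""
    h = x / steps
    integral = h * sum(
        2.0 / (1.0 + 2.0 * (x - (i + 0.5) * h)) for i in range(steps)
    )
    return survival(x) * integral


def density_by_difference(x, eps=1e-6):
    """Central-difference approximation to -d/dx survival(x)."""
    return -(survival(x + eps) - survival(x - eps)) / (2.0 * eps)


x0 = 0.7
from_limit = density_from_limit(x0)
from_difference = density_by_difference(x0)
```

For this prior the integral is $\ln(1 + 2x)$ exactly, so both quantities should agree with `survival(x0) * math.log(1 + 2 * x0)` up to discretization error.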

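(ii) To show $\phi_n(B,x) f_n(x) \to \phi(B,x) f(x)$, consider

$$\phi_n(B,x)\, f_n(x) = -\frac{d}{dx} \int_\Omega \exp\Big[-\int_{[0,x)} r_n(s)\, ds\Big] I_{A_n}(\omega)\, dP(\omega). \tag{7.14}$$

The derivative of the integrand in (7.14) is $-r_n(x) \exp[-\int_{[0,x)} r_n(s)\, ds]$, which is nonpositive and bounded below by the integrable function $-r_n(t_{k(n)}, \omega)$. Thus

$$\phi_n(B,x)\, f_n(x) = \int_\Omega r_n(x) \exp\Big[-\int_{[0,x)} r_n(s)\, ds\Big] I_{A_n}(\omega)\, dP(\omega)$$

$$= \int_{A_n} \Big[\sum_{j=1}^{k(n)} \beta_j\, I_{[0,x)}(t_j) \{Z(t_j) - Z(t_{j-1})\}\Big] \exp\big[-\Sigma\, \beta_i (x - t_i)^+ \{Z(t_i) - Z(t_{i-1})\}\big]\, dP(\omega)$$

$$= \sum_{j=1}^{k(n)} I_{[0,x)}(t_j) \int_{B_n(u,\beta)} \beta_j u_j \exp\big[-\Sigma\, \beta_i u_i (x - t_i)^+\big]\, \Pi\, g(u_i; \Delta\alpha_i)\, du_i$$

$$= \sum_{j=1}^{k(n)} \beta_j\, I_{[0,x)}(t_j) \int_{B_n(u,\beta)} \Big\{u_j^{\Delta\alpha_j} \exp\big[-u_j (1 + \beta_j (x - t_j)^+)\big] / \Gamma(\Delta\alpha_j)\Big\}\, du_j \prod_{i \ne j} \Big\{u_i^{\Delta\alpha_i - 1} \exp\big[-u_i (1 + \beta_i (x - t_i)^+)\big] / \Gamma(\Delta\alpha_i)\Big\}\, du_i$$

$$= \Big[\Pi\, \big(1 + \beta_i (x - t_i)^+\big)^{-\Delta\alpha_i}\Big] \Big\{\sum_{j=1}^{k(n)} \hat\beta_j\, I_{[0,x)}(t_j)\, \Delta\alpha_j \int_{B_n(v,\hat\beta)} g(v_j; \Delta\alpha_j + 1)\, dv_j \prod_{i \ne j} g(v_i; \Delta\alpha_i)\, dv_i\Big\} \tag{7.15}$$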

where $\beta_i u_i = \hat\beta_i v_i$. The term in the brackets converges to $\exp[-\int_{[0,\infty)} \ln(1 + \beta(s)(x-s)^+)\, d\alpha(s)]$ and the expression in the braces can be written as

$$\int_{[0,x)} \Big\{\Sigma\, \hat\beta_i\, F\big(B; \Gamma(\alpha_n + I_{(t_i,\infty)}, \hat\beta)\big)\, I_{(t_{i-1},\, t_i]}(s)\Big\}\, d\alpha(s).$$

Since $\alpha(s)$ and $\beta(s)$ are bounded for $s \in [0,x]$ and the integrand converges to $\hat\beta(s)\, F(B; \Gamma(\alpha + I_{(s,\infty)}, \hat\beta))$ by Lemma 7.1, application of LDCT yields

$$\phi_n(B,x)\, f_n(x) \to \exp\Big[-\int_{[0,x)} \ln\big(1 + \beta(s)(x-s)^+\big)\, d\alpha(s)\Big] \int_{[0,x)} \hat\beta(s)\, F\big(B; \Gamma(\alpha + I_{(s,\infty)}, \hat\beta)\big)\, d\alpha(s). \tag{7.16}$$

Thus $\phi_n(B,x) f_n(x) \to \phi(B,x) f(x)$ since the limit in (7.16) equals $\phi(B,x) f(x)$ by Theorem 3.1, (7.8) and (7.9). This concludes the proof for $m = 1$. For $m = 2$, a similar proof can be given. The posterior distribution after the first observation is used as the prior for the second. Thus $\phi_n(B,x)$ and $f_n(x)$ are similarly defined except that $r(t,\omega)$ is distributed as a mixture of extended gamma processes. The detailed computations are more cumbersome. One can use a generalization of an unsymmetric Fubini theorem given by Cameron and Martin (1941) to interchange the order of certain integrals that are encountered. Using LDCT and the result proved for $m = 1$, one arrives at the result for $m = 2$. The proof for arbitrary $m$ follows by mathematical induction.
